Since the passage of the Clean Air Act in 1970, several
epidemiological studies have attempted to associate morbidity
with indoor and outdoor exposure to nitrogen dioxide (NO2).
The indoor, so-called gas stove studies 1-7 produced mixed
and inconclusive results in their attempts to link health
impairment to the presence of a gas stove or gas heater in the
home. Studies of the health effects of outdoor NO2 exposures
also have failed to find consistently significant health
effects at ambient exposure levels. 8-11

In an analysis of people living in Chattanooga, Tennessee,
conducted under the Community Health and Environmental
Surveillance System (CHESS) program, Shy and Love 12
were able to link NO2 exposures and acute respiratory
disease. However, several problems have been raised about this
study. The researchers have been criticized for using
rudimentary statistical techniques, consisting mainly of pairwise
comparison of illness incidence rates in subpopulations. In
addition, the data base has been tainted by its association
with the controversial CHESS program. 13 [The earlier
CHESS Chattanooga studies were also criticized for using a
subsequently discredited method (Jacobs-Hochheiser) for
monitoring NO2 concentrations. However, by 1972 the
Saltzman technique was being used.] Yet EPA has found the
Chattanooga data to be accurately transferred from the
surveys to the computer tapes, and our own research has
revealed the data quality to be at least as high as that of
other similar, but much less controversial, data bases.

The lack of persuasive epidemiological studies upon which
to base a national ambient air quality standard for nitrogen
dioxide motivated the present paper. Here we return to the
CHESS aerometric and health data bases collected during
1972-73 in Chattanooga and used by Shy and Love to examine
the relationship between NO2 and acute respiratory disease
in children. We first describe and defend the
CHESS-Chattanooga data base and the statistical model used to
examine it. We then present our results and discuss a number
of econometric issues and their relationship to our findings.

The Database 

In January 1972, a self-administered survey on chronic
respiratory disease (CRD) was distributed to families with
children in elementary schools in one of three Chattanooga
communities, Harrison, Brainerd, and Red Bank, located
within one mile of an air pollution monitoring station.
A subsample of families (1970 parents and their children,
4898 individuals in all) was drawn from this sample to
participate in an acute respiratory disease (ARD) panel survey.
Information was taken in two-week intervals (always
beginning on a Sunday) over three school semesters from spring
1972 through spring 1973. Each family was phoned within
several days after the end of each two-week period to
determine if any family members experienced various acute
respiratory disease symptoms or consulted a physician.

Aerometric data were gathered at seven sites. Hourly
measurements of NO2 were taken using the Saltzman
chemiluminescence technique only for the fall 1972 and spring 1973
study periods. Thus, we eliminated data for the spring 1972
period from our analysis. Chattanooga was chosen as a site
for an NO2 study because it featured a TNT plant emitting
large quantities of nitrogen-based pollutants. This plant
closed January 1, 1973, resulting in reduced NO2
concentrations in the nearby communities. Daily readings were taken
on particulates, nitrates, and sulfates. These daily readings
were reduced to monthly frequency distributions.
Unfortunately, the original daily data were unavailable from EPA,
and we have been forced to use the monthly frequency
distributions for the latter three pollutants.

The Chattanooga health and aerometric data collection
effort of the early 1970s and the CHESS program in general
have been criticized (Roth 14) for their poor survey protocols,
health data inconsistencies, and aerometric data
unreliability. Krupnick and Harrington 13 provide a complete
reanalysis of these data and find, first, that the survey protocols
were carefully designed and observed. In addition, responses
to identical sociodemographic questions on the CRD and
ARD surveys were found to be quite consistent. Also, the
NO2 monitoring data were found to be reasonably complete,
generally consistent, and taken by devices that generally
outperform other types of monitors in the lab.

Further, because a duplicate CRD survey was administered
to some of the participants 22 months later, we were
able to identify inconsistencies intertemporally. Drawing on
responses from 948 parents and considering only the
questions concerning age, race, birthdate, education, and
smoking status, we found that over 80% of the parents had
matching responses over the two surveys. These results compare
favorably to similar investigations of the U.S. Census and
other highly regarded data bases. 16,17

In the course of our examination of the quality of this data
base we noted a number of recall-related problems inherent
in the survey procedure. These problems may also be present
in parts of other surveys, such as the Health Interview
Survey (HIS), that rely on biweekly interviews to collect acute
respiratory disease data. These problems are discussed in
some detail elsewhere, 14 but the main points may be
summarized as follows:

1) Respondents have imperfect recall of the day or even
week of onset of illness. For example, over 60% of illnesses
reported were said to have occurred in the second week of
the recall period, a result significantly (α = 0.01) different
from the uniform distribution one would expect. Apparently
respondents either forget (presumably minor) illnesses
occurring in the first week, or they remember disease onset as
occurring later than was actually the case.

2) When average duration of reported illness is plotted as
a function of day of onset during the two-week period, a
linear decline is found during the second week of the period,
with average duration at the end of the week barely half of
average duration at the beginning. The most likely
explanation is that illnesses extending past the end of the period are
not reported accurately, even though interviewers were
instructed to identify such illnesses and ask about them at the
end of the next reporting period. If this explanation is
correct, the truncation of restricted activity days imparts a
downward bias to illness severity.

3) In other panel studies it has been suggested that
respondents may, over time, progressively under-report illness
simply because they become tired of doing interviews. If
pollution levels are time-dependent, the study results may
be biased accordingly. We found little evidence of this
phenomenon. On the assumption that less serious illnesses could


Table I. Descriptive statistics (N = 2093).

Variable                          Mean value or
                                  population fraction
NEWILL                            0.13
RADS                              0.21
AGE                               7.7
Age distribution
  0-2                             0.09
  3-4                             0.08
  5-6                             0.16
  7-8                             0.23
  9-10                            0.25
  11-12                           0.19
RACE1W                            0.91
CHESTINF                          0.23
CHRON                             0.07
Education of household head
  High school graduate            0.71
  Attended some college           0.45
MOMHEAD                           0.06
Mother's smoking status
  Current                         0.32
  Ex-                             0.15
  Non-                            0.53
CROWD                             1.30
SEX1F                             0.48
GAS                               0.05
RAIN                              2.70
TEMP                              18.20


Table II. Pollution statistics.

            Mean      Standard     Correlation coefficients
Variable    (µg/m³)   deviation    PAR90P    SUL90P    TEMP
NO2MAX      98.0      48.3         -0.10     0.20      -0.09
PAR90P      100.6     31.4                   -0.027    0.34
SUL90P      10.0      2.7                              -0.036


be more likely to be neglected, we regressed the ratio of
“serious” to total illness incidence on time, and found no
trend.

These findings affected our subsequent data analysis in
two ways. First, no attempt was made to use time intervals
shorter than two weeks, even though the sample could be
reduced to weekly or even daily observations. Second, we
concentrated on the incidence of illness rather than
duration, inasmuch as we felt the former to be more reliable.


The Model 


To identify the factors affecting reported children's
disease, we use pooled cross-section time series models
predicting illness incidence or duration as a function of
demographic, pollution, and weather variables. Symbolically, the
models are of the form

$$S_{ijt} = f(X_{ij}, P_{jt}, W_t) + \epsilon_{ijt}$$

where $S_{ijt}$ is the reported incidence or duration of the
illness of the ith child in the jth neighborhood
during period t,

$X_{ij}$ is a vector of personal variables for the ith child
in the jth neighborhood,

$P_{jt}$ is a vector of pollution variables for the jth
neighborhood in period t,

$W_t$ is the weather in period t, and

$\epsilon_{ijt}$ is the disturbance term.


The independent variables are defined as follows:

AGE:       the child's age at the beginning of the school year.
RACE1W:    the race of the head of household; 1 if white, 0
           if nonwhite.
CHESTINF:  1 if the child has suffered a respiratory infection
           within the past three years, 0 otherwise.
CHRON:     1 if the child suffers from asthma or a chronic
           heart or lung condition, 0 otherwise.
EDU:       the years of schooling completed by the head
           of household.
MOMHEAD:   1 if the household head is female, 0 otherwise.
SMKPPD:    mother's smoking in packs per day.
CROWD:     number of household members divided by
           the number of rooms in the house.
SEX1F:     sex of child; 1 if female, 0 if male.
GAS:       1 if the kitchen stove is gas, 0 if electric.
RAIN:      amount of rainfall during the period, in inches.
EPIDEM:    monthly influenza cases reported by the
           State of Tennessee (in thousands).
TEMP:      the absolute difference between the average
           temperature during the period and 65°.
NO2MAX:    average daily maximum concentration of
           NO2, in µg/m³.
PAR90P:    90th percentile total suspended particulate
           concentration during the month, in µg/m³.
SUL90P:    90th percentile sulfate concentration during
           the month, in µg/m³.


As noted above, two dependent variables are considered: 
illness incidence (NEWILL), which is 0 or 1 according to 
Table III. Predicting illness incidence in population subsamples. a

Columns: A = all children 12 and under; B = children with
nonsmoking mothers; C = children with smoking mothers;
D = children without chronic respiratory disease; E = children
with CRD; F = children 6 and under; G = infants; H = fall only;
I = spring only.

          A           B           C           D           E           F           G           H           I
Intercept 0.0512      0.084       -0.0197     0.052       0.098       0.037       0.151       0.161       0.018
          (0.0503)    (0.059)     (0.091)     (0.056)     (0.095)     (0.10)      (0.21)      (0.113)     (0.075)
NO2MAXL   -8.87E-4    -6.10E-4    -13.5E-4    -6.4E-4     -13.6E-4    -8.44E-4    -12.5E-4    0.26E-4     -15.0E-4
          (2.05E-4)c  (2.46E-4)b  (3.71E-4)c  (2JE-4)c    (4.0E-4)c   (4.JE-4)b   (8.7E-4)    (4.3E-4)    (2.9E-4)c
NO2MAXH   1.71E-4     2.19E-4     -0.57E-4    2.3E-4      & IE-4      2.41E-4     5.2E-4      0.06E-4     3.95E-4
          (0.68E-4)b  (0.78E-4)c  (1.46E-4)   (1.13E-4)b  (1.J4E-4)   (1.89E-4)   (i»E-4)b    (0.78E-4)   (2.01E-4)b
PAR90P    8.88E-4     6.69E-4     14.3E-4     11.4E-4     4.00E-4     1UE4        13.0E-4     2S.1E-4     13.4E-4
          (6.08E-4)   (7.4E-4)    (10.8E-4)   (6.8E-4)    (12E-4)     (11.9E-4)   (24E-4)     (15E-4)     (9.1E-4)
PAR90P2   -0.053E-4   -0.044E-4   -0.075E-4   -0.058E-4   -0.041E-4   -0.066E-4   0.045E-4    -0.146E-4   -0.056E-4
          (0.028E-4)  (0.03E-4)   (0.048E-4)  (0.031E-4)  (0.0555)    (0.064)     (0.11E-4)   (0.074E-4)  (0.039E-4)
SUL90P    135E-4      117E-4      191E-4      89.8E-4     231E-4      128E-4      219E-4      -156E-4     95.3E-4
          (45.9E-4)c  (54E-4)b    (87E-4)b    (53E-4)     (87JE-4)c   (80E-4)     098E-4)     (217E-4)    (65.7E-4)
SUL90P2   -5.54E-4    -4.81E-4    -7J1E-4     -3.67E-4    -9.36E-4    -6.46E-4    -11.6E-4    5.76E-4     -3.38E-4
          (1.72E-4)c  (2.0E-4)b   (3.32E-4)b  (157-4t     (3.24E-4)c  (3.3E-4)    (7.4E-4)    (9.9E-4)    (2.2E-4)
AGE       -0.0185     -0.0165     -0.0252     -0.022      -0.010      d           e           -0.0207     -0.0164
          (0.0034)c   (0.0042)c   (0.0057)c   (0.0037)c   (0.0071)                            (0.0047)c   (0.0048)c
AGE2      7.53E-4     S.14E-4     14.9E-4     8.99E-4     3.32E-4     d           e           8.71E-4     S.4E-4
          (2.44E-4)c  (3.0E-4)    (4.2E-4)c   (2.66E-4)c  (5.27E-4)                           (3.47E-4)b  (3.46E-4)
CHESTINF  0.0475      0.0495      0.044                               0.071       0.103       0.046       0.048
          (0.010)c    (0.012)c    (0.018)b                            (0.019)c    (0.042)b    (0.014)c    (0.015)c
CHRON     0.044       0.0390      0.052                               0.036       0.003       0.036       0.052
          (0.0059)c   (0.0071)c   (0.011)c                            (0.011)c    (0.024)     (0.0083)c   (0.0084)c
CROWD     0.0184      0.0163      0.018       0.019       0.011       0.037       0.029       0.022       0.0156
          (0.0073)b   (0.0090)    (0.013)     (0.0083)b   (0.014)     (0.017)b    (0.715)     (0.010)b    (0.010)
EDU       -0.0012     -0.0045     0.0056      -0.0016     -0.0060     -0.0029     -0.0031     -0.00028    -0.0024
          (0.0021)    (0.0028)    (0.0039)    (0.0025)    (0.0095)    (0.0046)    (-0.315)    (0.0031)    (0.0032)
EPIDEM    0.072       0.089       0.036       0.055       0.099       0.064       0.059       -0.23       0.096
          (0.011)c    (0.013)c    (0.019)     (0.012)c    (0.021)c    (0.022)c    (1.31)      (0.085)c    (0.017)c
SEX1F     0.0076      0.0083      0.006       0.0066      0.011       -0.0080     0.0027      0.0021      0.012
          (0.0052)    (0.0063)    (0.0092)    (0.0058)    (0.010)     (0.011)     (0.023)     (0.0072)    (0.0071)
SMKPPD    -0.0013                 0.0019      -0.0027     0.0021      -0.0050     -0.0058     0.00126     -0.0037
          (0.0020)                (0.0049)    (0.0023)    (0.0036)    (0.0040)    (0.0086)    (0.0028)    (0.0028)
GAS       -0.020      -0.044      0.012       0.0044      -0.061      -0.057      -0.070      -0.0012     -0.036
          (0.012)     (0.015)c    (0.019)     (0.014)     (0.021)c    (0.026)b    (0.061)     (0.016)     (0.016)b
RAIN      -0.0056     -0.0068     -0.0027     -0.0050     -0.0060     -0.0054     0.0019      -0.019      -0.0027
          (0.0016)c   (0.0019)c   (0.0027)    (0.0017)c   (0.0031)    (0.0032)    (0.0064)    (0.0053)c   (0.0020)
TEMP      0.0022      0.0021      0.0025      0.0020      0.0026      0.0029      0.0015      0.0018      0.0026
          (0.00040)c  (0.00048)c  (0.00072)c  (0.00045)c  (0.00081)c  (0.00082)c  (0.0018)    (0.00086)b  (0.00066)c
RACE1W    0.056       0.039       0.073       0.051       0.060       0.059       0.087       0.034       0.075
          (0.0090)c   (0.012)c    (0.014)c    (0.0095)c   (0.019)c    (0.017)c    (0.039)b    (0.013)c    (0.013)c
N         16474       11497       4977        11557       5246        5108        1387        8176        6298
F         25.5        19.7        8.83        15.29       8.33        5.68        1.84        8.37        22.5
R²        0.0286      0.030       0.033       0.022       0.026       0.027       0.025       0.019       0.049

a Standard errors in parentheses.
b Significant at the 5% level.
c Significant at the 1% level.
d AGE and AGE2 were replaced by dummy variables AGEONE (= 1 if AGE = 1, 0 otherwise) and AGETWO.
e AGE and AGE2 were replaced by dummy variables AGEONE through AGESIX.

whether the child is reported ill during the two-week period
in question, and duration of restricted activity (RADS),
which takes an integer value between 0 and 14.

Tables I and II provide descriptive statistics on these
variables. Note the low number for mean RADS, indicating
the large percentage of observations with a zero value for this
variable. Correlation coefficients between each of the
pollutants and temperature are also provided. Note that
correlations between pollutants are all quite low. We searched for
more complicated patterns of collinearity by using the
diagnostic tests 18 provided with the SAS regression package.
These tests failed to reveal any serious collinearity problems
involving any of the independent variables.
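
As an illustration of what a collinearity diagnostic of this kind can look like, the sketch below computes condition indices from the singular values of a column-scaled design matrix. This is only one common diagnostic; whether it matches the tests in the SAS package used here is an assumption, and the function name `condition_indices` is hypothetical.

```python
import numpy as np


def condition_indices(X):
    """Condition indices of a design matrix (hypothetical helper).

    Each column is scaled to unit length, then the largest singular value
    is divided by every singular value. Large indices (a common rule of
    thumb is above 30) hint at near-linear dependence among columns.
    """
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=0)
    if np.any(norms == 0):
        raise ValueError("design matrix has an all-zero column")
    scaled = X / norms
    singular_values = np.linalg.svd(scaled, compute_uv=False)
    return singular_values.max() / singular_values


if __name__ == "__main__":
    # Two unrelated regressors: indices stay small.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))
    print(condition_indices(X))
```

Pairwise correlations such as those in Table II only detect dependence between two variables at a time, whereas a decomposition of the whole matrix can also flag dependence involving three or more regressors.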

We relied primarily on a linear probability model for our
analysis, using ordinary least squares (OLS) as the
estimation procedure, the results of which are presented below.
However, the OLS model requires a number of assumptions
of questionable validity for the current problem. We discuss
later the effects of these assumptions on the outcomes.

Results

Table III shows the results of the regressions predicting
illness incidence. Column A gives the results for the entire
sample of children aged 12 and under. The remaining
columns show results for a number of subpopulations; we used
these results to examine the stability of the coefficients and
to identify populations especially sensitive to the pollution
variables. Thus, Columns B and C give results for children of
mothers who do not and do smoke, and Columns D and E
give results for children without and with chronic
respiratory disease or a history of respiratory ailments. In Columns
F and G we examine the illness incidence in younger
children. Finally, in Columns H and I we divide the sample into
fall (October-December 1972) and spring (January-April
1973) time periods.

The specification of the NO2 variable was piecewise linear,
with a break at 100 µg/m³. This specification was the best of
all those examined. In Table III NO2MAXL and NO2MAXH
refer, respectively, to average daily maximum
concentrations below and above 100 µg/m³. As shown, for
NO2MAX concentrations below 100 µg/m³, illness probability
falls relatively sharply as NO2MAX increases. Above 100
µg/m³ illness probability gradually increases with NO2MAX,
but the illness rate at the highest observed two-week
maximum concentration (384 µg/m³) is less than that at the
lowest observed concentration (27 µg/m³). The clinical
literature gives no reason why a dose-response function should
have these characteristics.
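
The shape described above can be checked with a short calculation. The sketch below is illustrative only: the coding of the two spline terms (concentration capped at the break point, and excess over the break point) is an assumption, since the text does not spell it out, and the function names are hypothetical. The slopes are the full-sample estimates, -8.87E-4 below and 1.71E-4 above 100 µg/m³.

```python
import numpy as np

BREAK = 100.0  # break point in ug/m3, as stated in the text


def spline_terms(no2max, knot=BREAK):
    """Split a concentration into below-knot and above-knot parts.

    Assumed coding: the low term rises with concentration up to the knot
    and is constant afterwards; the high term is zero up to the knot and
    rises afterwards.
    """
    x = np.asarray(no2max, dtype=float)
    low = np.minimum(x, knot)
    high = np.maximum(x - knot, 0.0)
    return low, high


def no2_contribution(no2max, b_low=-8.87e-4, b_high=1.71e-4):
    """NO2 part of the predicted illness probability under the assumed coding."""
    low, high = spline_terms(no2max)
    return b_low * low + b_high * high


if __name__ == "__main__":
    # Lowest observed, break point, and highest observed concentrations.
    for level, value in zip((27.0, 100.0, 384.0),
                            no2_contribution([27.0, 100.0, 384.0])):
        print(level, value)
```

Under this coding the contribution at 384 µg/m³ remains below the contribution at 27 µg/m³, because the gentle upward slope above the break does not recover the decline accumulated below it.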

In the subsamples (Columns B to I), the above
relationship between NO2MAX and illness is replicated for children
with nonsmoking mothers and children without a history of
respiratory disease. For both preschool age children and
infants, the coefficients were similar although not always
significant. However, for children whose mothers smoke or
who have a chronic respiratory condition, NO2MAXH has
virtually no effect. Finally, when only fall periods are examined,
NO2MAX is not related to illness at all.

These exceptions did not increase our confidence in the
results. The first two exceptions suggest that a “sensitive
population” for NO2 is healthy older children of nonsmoking
mothers. If so, perhaps the presence of a chronic condition
swamps the small NO2 effect. Likewise, perhaps for children
exposed to parental smoking an additional NO2 effect
cannot be detected. One problem with this explanation is that
we found no adverse effect of mother's smoking on children's
health. As for the absence of an NO2 effect in the fall, we note
that the prevalence of illness in that season was relatively
low in any event. If the effect of NO2 is to reduce resistance
to disease, we might expect to find no NO2 effect when little
disease is present in the community. Such speculation
notwithstanding, we have not found an effect from NO2 that is
supported by clinical evidence or that is present in all
population subgroups.

For sulfates and particulates the best fits were obtained
for the 90th percentile of two-week concentration and a
quadratic specification, with a positive linear and negative
square term. The particulate results were reasonably
consistent across subpopulations, but rarely significant at the 5%
level. Moreover, the various functions were such that the
marginal effects of particulates on illness were negative at
concentrations above 80-100 µg/m³, which is near the average 90th
percentile concentration. That is, over much of the relevant
range the particulate variable is inversely related to illness.
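
For a quadratic specification b1*x + b2*x**2, the fitted curve peaks where the marginal effect b1 + 2*b2*x is zero, that is at x = -b1/(2*b2). The sketch below applies this to the full-sample particulate estimates (8.88E-4 for PAR90P and -0.053E-4 for PAR90P2); the function names are hypothetical and the calculation is offered only as a check on where the fitted curve turns.

```python
def turning_point(b_linear, b_square):
    """Concentration at which b_linear*x + b_square*x**2 reaches its peak.

    Requires a negative square term, as in the specification described above.
    """
    if b_square >= 0:
        raise ValueError("expected a negative square term")
    return -b_linear / (2.0 * b_square)


def marginal_effect(x, b_linear, b_square):
    """Derivative of the quadratic with respect to concentration at x."""
    return b_linear + 2.0 * b_square * x


if __name__ == "__main__":
    b1, b2 = 8.88e-4, -0.053e-4  # full-sample PAR90P and PAR90P2 estimates
    print(turning_point(b1, b2))
    # Mean 90th percentile particulate concentration from Table II.
    print(marginal_effect(100.6, b1, b2))
```

With these coefficients the turning point falls between 80 and 100 µg/m³, and the marginal effect evaluated at the mean concentration of 100.6 µg/m³ is negative.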

The coefficients for sulfates are significant for the entire
sample, but not for the fall and spring semesters separately.
For fall, the coefficients enter with signs the reverse of all
other subsamples, but the t-values are very small. For
spring, the coefficients are similar to coefficients in other
equations but the t-values still are not significant. For the
population subsamples the sulfate coefficients are stable
and significant except for infants, where, as we have noted,
sample sizes are much smaller and an effect of pollution on
health would be correspondingly more difficult to identify.
The inconsistent seasonal results may be related to using
weighted averages of monthly summaries of daily readings
instead of two-week averages, which were unavailable for
particulates and sulfates.

Turning briefly to the other explanatory variables, the
most statistically significant and robust results were for
variables that one would expect to be associated with respiratory
disease: age, a history of chest infection, presence of a
chronic condition, the extent of crowding in the home, and outside
temperature. Not only were these variables almost always
significant, but the coefficients were stable across
subpopulations. The coefficients for CHRON (presence of chronic
disease), for example, varied between 0.036 and 0.052,
except in the equation for infants, and indeed very few infants
in the sample were diagnosed with a chronic disease. The
EPIDEM variable was also generally significant, but in the
fall the sign was negative, a result we believe to be fortuitous,
inasmuch as the variable was very small in absolute value
during that season.

For two other variables, RACE1W and RAIN, the results
were stable and significant. White children consistently
reported more new illness than nonwhites. We also found a
consistent inverse relationship between the amount of rain- 


Table IV. Comparison of specifications of NO2 variables in equations predicting illness incidence. a

                  A            B            C            D            E            F            G
NO2MAX            1.4E-6       -3.5E-4      -21.7E-4
                  (56.8E-6)    (1.82E-4)b   (4.37E-4)c
NO2MAX2                        0.009E-4     11.3E-6
                               (0.0045E-4)  (2.31E-6)c
NO2MAX3                                     -1.59E-8
                                            (0.35E-8)c
NO2MAX(0-75)                                             -19E-4       -21E-4
                                                         (4.63E-4)c   (4.88E-4)c
NO2MAX(75-150)                                           -0.16E-4
                                                         (0.94E-4)
NO2MAX(75-100)                                                        -0.28E-4
                                                                      (0.80E-4)
NO2MAX(100-150)                                                       1.2E-4
                                                                      (1.1E-4)
NO2MAX(>150)                                             1.7E-4       0.56E-4
                                                         (1.15E-4)    (1.19E-4)
NO2MAX(0-100)                                                                      -8.9E-4
                                                                                   (2.05E-4)c
NO2MAX(>100)                                                                       1.7E-4
                                                                                   (0.69E-4)b
NO2AVG(0-50)                                                                                    -9.4E-4
                                                                                                (4.1E-4)b
NO2AVG(>50)                                                                                     1.96E-4
                                                                                                (3.8E-4)
F                 25.8         24.7         24.5         24.0         23.2         25.5         24.7
R²                0.0274       0.0277       0.0289       0.0281       0.0288       0.0286       0.0278
N                 16474        16474        16474        16474        16474        16474        16474

a Standard errors in parentheses.
b Significant at the 5% level.
c Significant at the 1% level.

Table V. Cross-section time series regressions predicting
illness incidence and restricted activity days: full sample vs.
sample consisting of one child per family. a

          Illness incidence           Restricted activity days
          Full sample   Subsample     Full sample   Subsample
Intercept 0.0512        -0.0016       0.348         0.275
          (0.06)        (0.73)        (0.135)b      (0.20)
NO2MAXL   -8.87E-4      -7J8E-4       -25.0E-4      -28.7E-4
          (2.05E-4)c    (2.9E-4)b     (6.6E-4)c     (8.3E-4)c
NO2MAXH   1.71E-4       1.39E-4       4.4E-4        6.22E-4
          (0.68E-4)b    (0.99E-4)     (1.86E-4)b    (2.8E-4)b
PAR90P    8.88E-4       17.8E-4       -18.3E-4      9.80E-4
          (6.08E-4)     (8.8E-4)b     (16.6E-4)     (24.5E-4)
PAR90P2   -0.053E-4     -0.068E-4     0.064E-4      -0.04E-4
          (0.028E-4)    (0.040E-4)b   (0.075E-4)    (0.099)
SUL90P    135E-4        107E-4        212E-4        73.3E-4
          (45.9E-4)c    (66E-4)       (126E-4)      (lSSE-4)
SUL90P2   -5.54E-4      -4.73E-4      -0.3E-4       -4.26E-4
          (1.72E-4)c    (2.5E-4)      (4.7E-4)b     (6.9E-4)
AGE       -0.0185       -0.020        -0.045        -0.044
          (0.0034)c     (0.0049)c     (0.0092)c     (0.014)c
AGE2      7.53E-4       8.54E-4       20.9E-4       19.1E-4
          (2.44E-4)c    (3.6E-4)b     (6.7E-4)c     (10.0E-4)
CHESTINF  0.0475        0.053         0.178         0.196
          (0.010)c      (0.015)c      (0.028)c      (0.042)c
CHRON     0.044         0.047         0.099         0.110
          (0.0059)c     (0.0084)c     (0.016)c      (0.023)c
CROWD     0.0184        0.018         0.019         0.0080
          (0.0073)b     (0.010)       (0.020)       (0.026)
EDU       -0.0012       0.0028        -0.0087       -0.0023
          (0.0021)      (0.0032)      (0.0061)      (0.0088)
EPIDEM    0.072         0.080         0.228         0.221
          (0.011)c      (0.015)c      (0.029)c      (0.042)c
SMKPPD    -0.0013       -0.0039       0.0004        0.0086
          (0.0020)      (0.0028)      (0.0050)      (0.0078)
GAS       -0.020        -0.0021       -0.037        0.0021
          (0.012)       (0.0016)      (0.032)       (0.045)
RAIN      -0.0056       -0.0049       -0.0037       0.0025
          (0.0016)c     (0.0022)b     (0.0043)      (0.0061)
TEMP      0.0022        0.0020        0.0029        0.0030
          (0.0004)c     (0.00057)c    (0.0011)c     (0.0016)
RACE1W    0.056         0.06$         0.127         0.136
          (0.0090)c     (0.014)c      (0.025)c      (0.038)c
SEX1F     0.0076        0.0094        0.023         -0.0039
          (0.0052)      (0.0074)      (0.014)       (0.021)
N         16474         6158          16474         8156
F         25.5          13.6          20.5          10.6
R²        0.0286        0.031         0.023         0.024

a Standard errors in parentheses.
b Significant at the 5% level.
c Significant at the 1% level.


fall and the incidence of illness in a two-week period,
although the magnitude varied by a factor of six between fall
and spring. Again, we have no explanation for this result. In
addition, the presence of a gas stove in the house appeared to
be unrelated to disease incidence. Few significant results
were obtained, and for those that were significant the sign
was contrary to expectation. As only 5% of the households
cooked with gas, these generally inconsistent results are not
particularly surprising.

Other covariates had virtually no explanatory power, and
were rather unstable across subpopulations: educational
level of head of household, sex, and mother's smoking status. In
particular, we found that a mother's smoking in the home
was unrelated to acute respiratory disease incidence of her
children. However, this should not be too surprising in view
of the contradictory findings on the health effects of passive
smoking. 19

An analysis was also carried out for illness duration as the
dependent variable. In this case the dependent variable $S_{ijt}$
took an integer value between 0 and 14. OLS estimates
predicting illness duration were very similar to the results
presented above; that is, independent variables that were
significant in predicting illness incidence were also
significant in predicting illness duration.
However, OLS estimates of truncated variables are
inconsistent as well as inefficient, 20 so it is especially important to
compare the results to those of a more suitable estimation
procedure. Thus, illness duration was also investigated using
Poisson regression, and the comparison between Poisson
and OLS is discussed below.

Some Problems of Estimation 

A pervasive problem in the estimation of the effects of air
pollution on illness is that information on personal exposure
to pollutants is rarely available. Researchers have been
obliged to use ambient monitoring data as a proxy for
personal exposure, and our study is no exception to this rule.
Nonetheless, every child in our sample lived and attended
school within a mile of a monitoring site, a relatively tight
radius compared to most similar studies.

Besides this measurement difficulty, there were several
major econometric problems. These problems arose
primarily from our desire to use a linear probability model and OLS
as the principal estimation procedure. Convenient though it
may be, the OLS model requires a number of assumptions of
questionable validity for the current problem. The question
we now examine is whether refinements that address these
assumptions make much difference to outcomes.

The first problem is that the dependent variable $S_{ijt}$ is
limited to the values 0 or 1 (for illness incidence) or to the
small positive integers (for illness duration). Thus, the OLS
estimators are not efficient, and the linear probability model
may not be appropriate in any event.

A second problem is concerned with the functional form of
the relationship between illness and air pollution (indeed,
between illness and any explanatory variable). As there is no
theory to guide the selection of functional form, we chose a
functional form on the basis of an information criterion
proposed by Sawa. 21

The third problem involves the structure of the
disturbance term $\epsilon_{ijt}$. We examined two alternatives to the OLS
assumption of uncorrelated disturbances:

Autoregression: an individual's health status in one period
may affect his or her health status in subsequent periods, in
which case $E(\epsilon_{ijt}\,\epsilon_{ijt'}) \neq 0$ for $t \neq t'$.

Contagion: one's health may be affected by the health of
others, especially family members and classmates, in which
case $E(\epsilon_{ijt}\,\epsilon_{i'j't}) \neq 0$ for $i \neq i'$ or $j \neq j'$.

These problems were examined sequentially. First,
several alternative functional forms were examined. Having
selected a functional form, we then examined the error
structure. Finally, alternative estimation procedures more suited
to limited dependent variables were investigated.

Functional Form

Table IV shows the relationship between illness incidence
and NO2 for several different specifications of the pollution
variable. The basic variable was NO2MAX, the daily
maximum NO2 reading, averaged over the two-week period. (Not
shown are specifications using average pollution variables,
which give results inferior to the ones for NO2MAX.)

The specifications examined include the following:

• linear specification

• quadratic

• cubic

• piecewise linear functions with one break point at 100
µg/m³, two break points at 75 and 150 µg/m³, and three
break points at 75, 100 and 150 µg/m³.

In all specifications, except the linear, the relationship
between NO2 and illness incidence is U-shaped. Based on

Table VI. Comparison of logit and OLS models predicting illness incidence.

          OLS                       Logit
          Coefficient  Std. error   Coefficient  Std. error   ∂S/∂x
Intercept 0.176        0.029 b      -1.80        0.256
NO2MAXL   -8.98E-4     2.04E-4 b    -0.0067      0.00174 b    -7.63E-4 b
NO2MAXH   1.61E-4      0.68E-4 a    0.00135      0.000595 a   1.53E-4 a
PAR90P    -2.63E-4     0.923E-4 b   -0.00187     0.00082 a    -2.13E-4 a
SUL90P    2.55E-4      12.1E-4      -0.00138     0.0106       -1.57E-4
AGE       -0.0062      0.00085 b    -0.0715      0.0074 b     -0.00814 b
CROWD     0.150        0.0071 a     0.151        0.064 a      0.0172 a
EPIDEM    0.074        0.010 b      0.537        0.067 b      0.061 b
GAS       -0.020       0.012        -0.187       0.110        -0.020
RAIN      -0.0063      0.0016 b     -0.051       0.015 b      -0.0058 b
TEMP      0.001S4      0.00039 b    0.0167       0.00356 b    0.00190 b
RACE1W    0.0540       0.0088 b     0.542        0.093 b      0.0526 b
CHRON     0.0416       0.0059 b     0.350        0.0450 b     0.0423 b
CHESTINF  0.0460       0.0102 b     0.342        0.081 b      0.0434 b

a Significant at the 5% level.
b Significant at the 1% level.


the BIC criterion proposed by Sawa, 21 the best performer is
the piecewise linear specification with a break point at 100
µg/m³.

We also tested these spline specifications against the
quadratic specification using one of the tests described by
Davidson and MacKinnon 22 for non-nested models. The result
of this test was as follows: When the quadratic specification
was taken as the null hypothesis against the piecewise linear
alternative, the null hypothesis was rejected. However, with
the spline taken as the null hypothesis the null could not be
rejected. Thus, the spline specification with a break point at
100 µg/m³ fit the data best, and this was used in subsequent
work.

Error Structure

To examine the effect of possible serial correlation we
assumed a first-order autocorrelation scheme and used a
two-stage procedure described by Kmenta. 23 First we estimated
the autocorrelation parameter ρ using OLS, and then
reestimated the model

$$(y_t - \hat{\rho}\, y_{t-1}) = (x_t - \hat{\rho}\, x_{t-1})\beta + (\varepsilon_t - \hat{\rho}\, \varepsilon_{t-1})$$

Our estimate for ρ was ρ̂ = 0.006. In the second stage, we
found the following results for the NO2 variables, which, it
will be noted, are essentially the same as Column H of Table
III:

S = -9.35E-4 NO2MAXL + 1.76E-4 NO2MAXH + other terms
    (2.12E-4)          (0.81E-4)

with n = 14907 and F = 25.1 for the equation. This result
indicated that the problem of autocorrelation could be ignored.
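The two stages can be illustrated on a single time series with one regressor. This is only a sketch under simplifying assumptions: the actual data are repeated observations on many children with many covariates, and the function names and synthetic data below are hypothetical.

```python
import random

def simple_ols(x, y):
    """Intercept and slope of y on a single regressor x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def two_stage(x, y):
    """Return (rho_hat, intercept, slope) from the two-stage procedure."""
    a, b = simple_ols(x, y)
    e = [yi - a - b * xi for xi, yi in zip(x, y)]
    # Stage 1: regress e_t on e_{t-1} (no intercept) to estimate rho.
    rho = sum(e[t] * e[t - 1] for t in range(1, len(e))) / sum(v * v for v in e[:-1])
    # Stage 2: OLS on the quasi-differenced series.
    xs = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
    ys = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
    a_star, slope = simple_ols(xs, ys)
    # The transformed intercept equals intercept * (1 - rho).
    return rho, a_star / (1.0 - rho), slope

# Synthetic check: true intercept 1.0, slope 2.0, AR(1) errors with rho = 0.6.
random.seed(0)
x = [random.uniform(0.0, 10.0) for _ in range(2000)]
err, y = 0.0, []
for xi in x:
    err = 0.6 * err + random.gauss(0.0, 1.0)
    y.append(1.0 + 2.0 * xi + err)
rho_hat, intercept_hat, slope_hat = two_stage(x, y)
```

When the estimated ρ is close to zero, as with the value of 0.006 reported above, the quasi-differenced series are almost identical to the originals, so the second-stage coefficients barely move.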

Contagion presented a problem that we were not able to 
resolve fully, due to a lack of complete information on all the 
physical contacts among the various members of the sample.
However, we were able to examine contagion in the home, 
one of the most likely places where diseases may be spread. 

If contagion in the home is present, the estimated effect on 
incidence and duration of variables common to members of a 
family, such as their exposure to air pollutants, will exceed 
the true effect. To test for this possibility we compared 
regression results from the full sample to the results from a 
subset consisting of one child chosen randomly from each 
family represented in the sample (Table V). Although the 
standard errors on the latter are a bit larger (which is what
one would expect from the reduction in sample size), the
coefficients are quite similar. Thus, the results are probably 
not much affected by spread of disease in the home. 
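Drawing the one-child-per-family subset can be sketched as follows. The record layout (`family_id`, `child_id`, `ill`) is hypothetical, and the sketch assumes that every observation of the selected child is retained.

```python
import random
from collections import defaultdict

def one_child_per_family(records, seed=0):
    """Keep all rows of one randomly chosen child per family.

    records: list of dicts with (hypothetical) keys 'family_id' and 'child_id'.
    """
    rng = random.Random(seed)
    children = defaultdict(set)
    for rec in records:
        children[rec["family_id"]].add(rec["child_id"])
    # Sort before choosing so the draw is reproducible for a given seed.
    chosen = {fam: rng.choice(sorted(kids)) for fam, kids in children.items()}
    return [r for r in records if chosen[r["family_id"]] == r["child_id"]]

rows = [
    {"family_id": 1, "child_id": "a", "ill": 0},
    {"family_id": 1, "child_id": "a", "ill": 1},
    {"family_id": 1, "child_id": "b", "ill": 0},
    {"family_id": 2, "child_id": "c", "ill": 1},
]
subset = one_child_per_family(rows, seed=42)
```

The same regression would then be fitted to `rows` and to `subset`. Coefficients that stay close across the two fits suggest that within-family contagion is not driving the estimates.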


Limited Dependent Variables

In this section we examine whether the results depend on
our use of OLS rather than techniques more suited to limited
dependent variables. Specifically, we tested the linear probability
model against a logit model for predicting illness
incidence, using the "C" test described by Davidson and
MacKinnon. 22 This test showed the logit model to be superior
in the following sense: When the null hypothesis H0 is the
OLS model and the logit model is H1, H0 is rejected; however,
when the roles are reversed and H0 is the logit model, H0
cannot be rejected.

Table VII. Comparison of Poisson and OLS models predicting illness duration.

             OLS                         Poisson
             Coefficient   Std. error    Coefficient   Std. error    ∂S/∂x

Intercept    0.352         0.079*       -1.62          0.361*
NO2MAXL     -0.00249       0.00056*     -0.0083        0.0023*      -0.00173*
NO2MAXH      0.00043       0.000186*     0.00195       0.00084*      0.00041*
PAR90P      -0.0047        0.00025      -0.00101       0.00120      -0.0021
SUL90P      -0.0026        0.0033       -0.0120        0.016        -0.0025
AGE         -0.0174        0.0023*      -0.078         0.0102*      -0.0162*
CROWD        0.0089        0.0193        0.075         0.092         0.0156
EPIDEM       0.216         0.028*        0.768         0.117*        0.160*
GAS         -0.032         0.031        -0.155         0.158        -0.030
RAIN        -0.0044        0.0042        0.0022        0.0195        0.00046
TEMP         0.0030        0.0011*       0.0152        0.0052*       0.0032*
RACE1W       0.119         0.024*        0.688         0.148*        0.114*
CHRON        0.091         0.018*        0.399         0.069*        0.093*
CHESTINF     0.178         0.028*        0.571         0.097*        0.155*

* Significant at the 5% level.
* Significant at the 1% level.

For illness duration a similar comparison was made between
OLS and Poisson regression. Again, the OLS model
was found to be inferior. Nonetheless, the coefficients on the
independent variables estimated using OLS were very similar
to the corresponding coefficients for the logit and Poisson
models. These coefficients are compared in Tables VI and
VII. To facilitate comparison, the rightmost column of each
table is the derivative of the dependent variable of the Poisson
or logit function, evaluated at the mean of the dependent
variable. (For discrete independent variables the entry is the
average change in probability of illness, estimated by the
weighted sum of the change in probability when the variable
is added at the mean and when it is taken away.) Even
though OLS appeared slightly inferior to logit and Poisson
regression in predicting illness incidence and duration, the
qualitative results were hardly affected.
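The rightmost columns of Tables VI and VII can be reproduced approximately with the standard derivative formulas for the two models. In the sketch below the mean incidence (0.131) and mean duration (0.208) are assumptions back-calculated from the ratio of the tabulated columns, and the equal weighting in the discrete case is also assumed; none of these values is stated in the text.

```python
import math

def logit_marginal_effect(beta, p_mean):
    """dP/dx for a logit model, evaluated at probability p_mean."""
    return beta * p_mean * (1.0 - p_mean)

def poisson_marginal_effect(beta, mu_mean):
    """dE[y]/dx for a Poisson model with log link, evaluated at mean mu_mean."""
    return beta * mu_mean

def logit_discrete_effect(beta, p_mean, w_add=0.5):
    """Weighted average change in P when a 0/1 variable is added or removed.

    ASSUMPTION: w_add is the weight on the 'added' change; 0.5 is a guess.
    """
    z = math.log(p_mean / (1.0 - p_mean))          # logit at the mean
    p_on = 1.0 / (1.0 + math.exp(-(z + beta)))     # variable added
    p_off = 1.0 / (1.0 + math.exp(-(z - beta)))    # variable taken away
    return w_add * (p_on - p_mean) + (1.0 - w_add) * (p_mean - p_off)

P_MEAN, MU_MEAN = 0.131, 0.208  # assumed means, not reported in the text

me_no2maxl_logit = logit_marginal_effect(-0.0067, P_MEAN)       # Table VI: -7.63E-4
me_no2maxl_poisson = poisson_marginal_effect(-0.0083, MU_MEAN)  # Table VII: -0.00173
me_epidem_logit = logit_discrete_effect(0.537, P_MEAN)          # Table VI: 0.061
```

Putting the logit and Poisson coefficients on this derivative scale is what makes them directly comparable with the OLS coefficients, which are already changes in the dependent variable per unit change in the regressor.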


Conclusions

A CHESS data base from Chattanooga, Tennessee was
thoroughly scrutinized and found to be of high enough quality
to warrant epidemiological analysis. Using this data base,
the relationship between NO2 ambient pollution levels and
acute respiratory disease in children was examined. Although
a statistically significant relationship was found, it
was not monotonic. Indeed, over the range of pollution values
experienced, more illness is associated with low pollution
values than with high ones. A U-shaped relationship between
illness and NO2 concentrations was found in several
subpopulations in addition to the entire data set, although
for some subpopulations no relationship was found. As far as
we know, there is no clinical explanation for this result. In
contrast, higher ambient sulfate levels were found to have a
positive effect on acute respiratory disease incidence in children
over the entire period and for different subsamples,
although this effect was not significant for either season
analyzed separately.

The strange relationship between NO2 concentrations and
ARD in children could be attributable to three problems
inherent in any epidemiological study. First, the relationship
could be entirely fortuitous, although the odds against
this for our study are long. Second, both illness and NO2
could be related to some unobserved variable. However, such
a variable must have strange properties, because for certain
well-defined subsets, its relationship to either illness or NO2
changes substantially. Finally, the data could still contain
biases that create the observed effects.

In short, there is reason to be skeptical of a U-shaped
dose-response function relating ambient NO2 levels and
acute respiratory disease. Nonetheless, we suggest that nonmonotonic
dose-response functions be explicitly considered
in future epidemiological or clinical research on the health
effects of NO2 and perhaps other pollutants as well.